SA-REPC - Sequence Alignment with Regular Expression Path Constraint
Abstract
In this paper, we define a novel variation on the constrained sequence alignment problem, the Sequence Alignment with Regular Expression Path Constraint (SA-REPC) problem, in which the constraint is given in the form of a regular expression. Our definition extends and generalizes the existing definitions of alignment-path constrained sequence alignments to the expressive power of regular expressions. We give a solution for the new variation of the problem and demonstrate its application to integrating microRNA-target interaction patterns into the target prediction computation. Our approach can serve as an efficient filter for more computationally demanding target prediction filtration algorithms. We compare cAlign, our implementation for the SA-REPC problem, with other microRNA target prediction algorithms.
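To make the idea of a regular expression constraint on the alignment path concrete, here is a minimal sketch. It is not the authors' cAlign algorithm. It assumes that the alignment path is a string over a hypothetical operation alphabet (M = match, X = mismatch, D = gap in the second sequence, I = gap in the first sequence) and that the regular expression has already been compiled by hand into a deterministic automaton. The function name, scoring values, and automaton encoding are all illustrative assumptions. The sketch runs an ordinary global-alignment dynamic program over the product of the alignment grid and the automaton states.

```python
from typing import Dict, List, Optional, Set, Tuple

def constrained_align(s: str, t: str,
                      delta: Dict[Tuple[int, str], int],
                      start: int, accepting: Set[int],
                      match: int = 1, mismatch: int = -1,
                      gap: int = -1) -> Optional[int]:
    """Best global alignment score whose operation path the DFA accepts.

    Path alphabet (an assumption of this sketch): 'M' match, 'X' mismatch,
    'D' consume a symbol of s only, 'I' consume a symbol of t only.
    Returns None when no alignment path satisfies the constraint.
    """
    n, m = len(s), len(t)
    # best[i][j] maps a DFA state to the best score of aligning s[:i] with
    # t[:j] along a path that leaves the DFA in that state.
    best: List[List[Dict[int, int]]] = [
        [dict() for _ in range(m + 1)] for _ in range(n + 1)
    ]
    best[0][0][start] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            for q, score in best[i][j].items():
                moves = []
                if i < n and j < m:
                    if s[i] == t[j]:
                        moves.append(("M", i + 1, j + 1, match))
                    else:
                        moves.append(("X", i + 1, j + 1, mismatch))
                if i < n:
                    moves.append(("D", i + 1, j, gap))
                if j < m:
                    moves.append(("I", i, j + 1, gap))
                for op, ni, nj, gain in moves:
                    nq = delta.get((q, op))
                    if nq is None:  # the DFA rejects this operation here
                        continue
                    cand = score + gain
                    if nq not in best[ni][nj] or cand > best[ni][nj][nq]:
                        best[ni][nj][nq] = cand
    finals = [sc for q, sc in best[n][m].items() if q in accepting]
    return max(finals) if finals else None

# Hypothetical constraint MM(M|X|I|D)*: the path must open with two matches.
SEED_DFA = {(0, "M"): 1, (1, "M"): 2}
SEED_DFA.update({(2, op): 2 for op in "MXID"})

print(constrained_align("ACGT", "ACGA", SEED_DFA, 0, {2}))  # 2
print(constrained_align("TCGT", "ACGT", SEED_DFA, 0, {2}))  # None
```

The second call returns None because the first aligned pair is a mismatch, so no path can begin with two matches. Under these assumptions the table has one entry per grid cell and automaton state, so the running time grows with the product of the two sequence lengths and the number of states.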
Similar resources
RE-MuSiC: a tool for multiple sequence alignment with regular expression constraints
RE-MuSiC is a web-based multiple sequence alignment tool that can incorporate biological knowledge about structure, function, or conserved patterns regarding the sequences of interest. It accepts amino acid or nucleic acid sequences and a set of constraints as inputs. The constraints are pattern descriptions, instead of exact positions of fragments to be aligned together. The output is an align...
Characterization of the minimum replicon of the broad-host-range plasmid pTF-FC2 and similarity between pTF-FC2 and the IncQ plasmids.
The nucleotide sequence of a 3,202-base-pair fragment which contained the minimum region required for replication of the broad-host-range plasmid, pTF-FC2, has been determined. At least five open reading frames and a region that affected the host range were identified. Proteins corresponding in size and location to four of the five open reading frames were produced in an in vitro transcription-...
Detecting conserved secondary structures in RNA molecules using constrained structural alignment
Constrained sequence alignment has been studied extensively in the past. Different forms of constraints have been investigated, where a constraint can be a subsequence, a regular expression, or a probability matrix of symbols and positions. However, constrained structural alignment has been investigated to a much lesser extent. In this paper, we present an efficient method for constrained struc...
Efficient Regular Expression Signature Generation for Network Traffic Classification
Regular expression signatures are most widely used in network traffic classification for trusted network management. These signatures are generated by the sequence alignment of the traffic payload. The most commonly used sequence alignment algorithm is Longest Common Subsequence (LCS) algorithm which computes the global similarity between two strings but it fails in consecutive character matche...
Regular Expression Constrained Sequence Alignment
Given strings S1, S2, and a regular expression R, we introduce regular expression constrained sequence alignment as the problem of finding the maximum alignment score between S1 and S2 over all alignments such that in these alignments there exists a segment where some substring s1 of S1 is aligned with some substring s2 of S2, and both s1 and s2 match R, i.e. s1, s2 ∈ L(R) where L(R) is the reg...